AITopics | relu dnn

Collaborating Authors

relu dnn

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Complexity of One-Dimensional ReLU DNNs

Kogan, Jonathan, Jananthan, Hayden, Kepner, Jeremy

arXiv.org Machine LearningDec-10-2025

Abstract--We study the expressivity of one-dimensional (1D) ReLU deep neural networks through the lens of their linear regions. We also propose a function-adaptive notion of sparsity that compares the expected regions used by the network to the minimal number needed to approximate a target within a fixed tolerance. Deep Neural Networks (DNNs) with Rectified Linear Unit (ReLU) activation functions are piecewise-linear functions whose expressive power can be studied via the number of linear regions that they create [1]-[3]. However, achieving such approximations for complicated functions typically demands substantial computational resources. The Lottery Ticket Hypothesis states that we can often remove many connections while maintaining similar performance, motivating the study of sparse DNNs [7].

breakpoint, denote, linear region, (14 more...)

arXiv.org Machine Learning

2512.08091

Country: North America > United States > Massachusetts (0.04)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Some Super-approximation Rates of ReLU Neural Networks for Korobov Functions

Li, Yuwen, Zhang, Guozhi

arXiv.org Artificial IntelligenceJul-15-2025

Introduction In recent years, Deep Neural Networks (DNNs) have emerged as a crucial technology in the fields of machine learning and artificial intelligence and achieved tremendous success. The remarkable success of DNNs has inspired numerous practical applications in computer science and engineering, including image recognition, natural language processing, autonomous driving, scientific computing, etc. Towards an explainable framework of machine learning, understanding the approximation power of DNNs is a fundamental question and remains an active research area.

artificial intelligence, machine learning, neural network, (14 more...)

arXiv.org Artificial Intelligence

2507.10345

Country: Europe (0.28)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Add feedback

Approximation and Generalization Abilities of Score-based Neural Network Generative Models for Sub-Gaussian Distributions

Fu, Guoji, Lee, Wee Sun

arXiv.org Machine LearningMay-19-2025

This paper studies the approximation and generalization abilities of score-based neural network generative models (SGMs) in estimating an unknown distribution $P_0$ from $n$ i.i.d. observations in $d$ dimensions. Assuming merely that $P_0$ is $α$-sub-Gaussian, we prove that for any time step $t \in [t_0, n^{O(1)}]$, where $t_0 \geq O(α^2n^{-2/d}\log n)$, there exists a deep ReLU neural network with width $\leq O(\log^3n)$ and depth $\leq O(n^{3/d}\log_2n)$ that can approximate the scores with $\tilde{O}(n^{-1})$ mean square error and achieve a nearly optimal rate of $\tilde{O}(n^{-1}t_0^{-d/2})$ for score estimation, as measured by the score matching loss. Our framework is universal and can be used to establish convergence rates for SGMs under milder assumptions than previous work. For example, assuming further that the target density function $p_0$ lies in Sobolev or Besov classes, with an appropriately early stopping strategy, we demonstrate that neural network-based SGMs can attain nearly minimax convergence rates up to logarithmic factors. Our analysis removes several crucial assumptions, such as Lipschitz continuity of the score function or a strictly positive lower bound on the target density.

log 2, machine learning, natural language, (16 more...)

arXiv.org Machine Learning

2505.1088

Country:

Europe > Austria > Vienna (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Singapore (0.04)
(7 more...)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Review for NeurIPS paper: Rational neural networks

Neural Information Processing SystemsJan-27-2025, 06:08:26 GMT

The paper studies rational DNNs --- deep neural networks where rational functions (of small degrees) are used as non-linearities. The paper provides many interesting theoretical results on the approximation properties of the rational DNNs (specifically, in comparison to ReLU DNNs). The paper also provides two experiments (learning the solution of the 2-dimensional PDE and applications in generative adversarial networks), which are meant to demonstrate that rational activations have advantages compared to other popular activations (ReLu, sine, tanh, polynomial, etc) when used in actual DNN training. The theory presented in the paper establishes that: (1) Consider two problems: (i) Approximating (in the uniform norm) a function implemented with the rational DNNs using ReLU DNNs; and (ii) approximating a function implemented with the ReLU DNNs using rational DNNs. Theorem 3 shows that (ii) is much easier than (i): (ii) can be solved to eps-precision with log(log(1 / eps)) many parameters, whereas (i) requires at least log(1 / eps) parameters (exponentially more).

dnn, experiment, rational dnn, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Review for NeurIPS paper: Neural Path Features and Neural Path Kernel : Understanding the role of gates in deep learning

Neural Information Processing SystemsJan-23-2025, 11:23:50 GMT

I have increased my score. Original summary: The authors define two key properties of a ReLU DNN: the Neural Path Feature (NPF) and the Neural Path Value (NPV). The NPF encodes which paths are active and the input features associated with those paths.

artificial intelligence, machine learning, neural path feature, (12 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.40)

Add feedback

From Monte Carlo to neural networks approximations of boundary value problems

Beznea, Lucian, Cimpean, Iulian, Lupascu-Stamate, Oana, Popescu, Ionel, Zarnescu, Arghir

arXiv.org Artificial IntelligenceDec-4-2023

In this paper we study probabilistic and neural network approximations for solutions to Poisson equation subject to H\" older data in general bounded domains of $\mathbb{R}^d$. We aim at two fundamental goals. The first, and the most important, we show that the solution to Poisson equation can be numerically approximated in the sup-norm by Monte Carlo methods, { and that this can be done highly efficiently if we use a modified version} of the walk on spheres algorithm { as an acceleration method. This provides estimates which are efficient with respect to the prescribed approximation error and with polynomial complexity in the dimension and the reciprocal of the error.} {A crucial feature is that} the overall number of samples does not not depend on the point at which the approximation is performed. As a second goal, we show that the obtained Monte Carlo solver renders { in a constructive way} ReLU deep neural network (DNN) solutions to Poisson problem, whose sizes depend at most polynomialy in the dimension $d$ and in the desired error. In fact we show that the random DNN provides with high probability a small approximation error and low polynomial complexity in the dimension.

approximation, diam, proposition 2, (16 more...)

arXiv.org Artificial Intelligence

2209.01432

Country:

Europe > Romania > București - Ilfov Development Region > Municipality of Bucharest > Bucharest (0.04)
Europe > Spain (0.04)

Genre: Research Report > New Finding (0.45)

Technology:

Information Technology > Mathematics of Computing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

On the Optimal Expressive Power of ReLU DNNs and Its Application in Approximation with Kolmogorov Superposition Theorem

He, Juncai

arXiv.org Artificial IntelligenceAug-10-2023

This paper is devoted to studying the optimal expressive power of ReLU deep neural networks (DNNs) and its application in approximation via the Kolmogorov Superposition Theorem. We first constructively prove that any continuous piecewise linear functions on $[0,1]$, comprising $O(N^2L)$ segments, can be represented by ReLU DNNs with $L$ hidden layers and $N$ neurons per layer. Subsequently, we demonstrate that this construction is optimal regarding the parameter count of the DNNs, achieved through investigating the shattering capacity of ReLU DNNs. Moreover, by invoking the Kolmogorov Superposition Theorem, we achieve an enhanced approximation rate for ReLU DNNs of arbitrary width and depth when dealing with continuous functions in high-dimensional spaces.

artificial intelligence, machine learning, relu dnn, (15 more...)

arXiv.org Artificial Intelligence

2308.05509

Country:

North America > United States > New York > New York County > New York City (0.04)
Asia > Middle East > Saudi Arabia (0.04)
Asia > China (0.04)

Genre: Research Report > New Finding (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Rethink Depth Separation with Intra-layer Links

Fan, Feng-Lei, Li, Ze-Yu, Xiong, Huan, Zeng, Tieyong

arXiv.org Artificial IntelligenceMay-11-2023

The depth separation theory is nowadays widely accepted as an effective explanation for the power of depth, which consists of two parts: i) there exists a function representable by a deep network; ii) such a function cannot be represented by a shallow network whose width is lower than a threshold. However, this theory is established for feedforward networks. Few studies, if not none, considered the depth separation theory in the context of shortcuts which are the most common network types in solving real-world problems. Here, we find that adding intra-layer links can modify the depth separation theory. First, we report that adding intra-layer links can greatly improve a network's representation capability through bound estimation, explicit construction, and functional space analysis. Then, we modify the depth separation theory by showing that a shallow network with intra-layer links does not need to go as wide as before to express some hard functions constructed by a deep network. Such functions include the renowned "sawtooth" functions. Moreover, the saving of width is up to linear. Our results supplement the existing depth separation theory by examining its limit in the shortcut domain. Also, the mechanism we identify can be translated into analyzing the expressivity of popular shortcut networks such as ResNet and DenseNet, \textit{e.g.}, residual connections empower a network to represent a sawtooth function efficiently.

artificial intelligence, breakpoint, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2305.07037

Country:

Asia > China > Hong Kong (0.04)
Europe > Estonia > Harju County > Tallinn (0.04)
Asia > China > Heilongjiang Province > Harbin (0.04)

Genre: Research Report (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Side Effects of Learning from Low-dimensional Data Embedded in a Euclidean Space

He, Juncai, Tsai, Richard, Ward, Rachel

arXiv.org Artificial IntelligenceFeb-4-2023

The low-dimensional manifold hypothesis posits that the data found in many applications, such as those involving natural images, lie (approximately) on low-dimensional manifolds embedded in a high-dimensional Euclidean space. In this setting, a typical neural network defines a function that takes a finite number of vectors in the embedding space as input. However, one often needs to consider evaluating the optimized network at points outside the training distribution. This paper considers the case in which the training data is distributed in a linear subspace of $\mathbb R^d$. We derive estimates on the variation of the learning function, defined by a neural network, in the direction transversal to the subspace. We study the potential regularization effects associated with the network's depth and noise in the codimension of the data manifold. We also present additional side effects in training due to the presence of noise.

artificial intelligence, machine learning, neural network, (20 more...)

arXiv.org Artificial Intelligence

2203.00614

Country:

North America > United States > Texas > Travis County > Austin (0.14)
Asia > China (0.04)

Genre: Research Report (0.50)

Industry: Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

ReLU Deep Neural Networks from the Hierarchical Basis Perspective

He, Juncai, Li, Lin, Xu, Jinchao

arXiv.org Artificial IntelligenceJul-1-2022

We study ReLU deep neural networks (DNNs) by investigating their connections with the hierarchical basis method in finite element methods. First, we show that the approximation schemes of ReLU DNNs for $x^2$ and $xy$ are composition versions of the hierarchical basis approximation for these two functions. Based on this fact, we obtain a geometric interpretation and systematic proof for the approximation result of ReLU DNNs for polynomials, which plays an important role in a series of recent exponential approximation results of ReLU DNNs. Through our investigation of connections between ReLU DNNs and the hierarchical basis approximation for $x^2$ and $xy$, we show that ReLU DNNs with this special structure can be applied only to approximate quadratic functions. Furthermore, we obtain a concise representation to explicitly reproduce any linear finite element function on a two-dimensional uniform mesh by using ReLU DNNs with only two hidden layers.

approximation, dnn, relu dnn, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.camwa.2022.06.006

2105.04156

Country:

North America > United States > Texas > Travis County > Austin (0.14)
Asia > China > Beijing > Beijing (0.04)
North America > United States > Pennsylvania > Centre County > University Park (0.04)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback